1. INTRODUCTION

Tobacco kills a large number of its users when used as prescribed by the manufacturers. Globally, 
six million people die annually from tobacco use, and the figure is estimated to reach eight million by 2030 [1]. Research
shows that women and children are the most affected by the drug [2]. Measures to control the epidemic have 
been limited to developed countries. 

Whereas tobacco use appears to be declining in many developed countries, it is increasing in the 
developing world [3], in part due to the economic growth in some developing countries. Consequently, the
tobacco industry is capitalizing on this growth through tobacco
advertising [4]. African countries have lower rates of tobacco taxation and less stringent tobacco
advertising restrictions in comparison to higher-income countries [5]. Increasing tobacco taxes by 10%
generally decreases tobacco consumption by about 8% in low-income and middle-income countries. Many 
Sub-Saharan countries have failed to allocate the appropriate financial resources to tobacco prevention 
programs, despite the cost-effectiveness of such programs [6]. 

Tobacco use is prevalent in Ghana. The common forms of usage include not only cigarette smoking, 
but also pipe smoking, chewing, sniffing and oral or nasal use [7]. Until recently, there were no studies 
showing tobacco prevalence for the whole country [8]. The rates of smoking among the general population in 
Ghana are relatively low, with males smoking more than females [9]. Earlier models of the tobacco epidemic
failed to explain the long delay between the increase in smoking and the increase in smoking-related mortality in
developing countries like Ghana [10]. Stage one of the epidemic is characterized by a higher prevalence of smoking
among males than females. However, the World Health Organization [11] reported higher rates among the
youth, suggesting a rise in future use. An understanding of smoking patterns among adolescents can shed
light on trends in smoking initiation [12]. Growth in the Ghanaian tobacco industry is a concern. The growth
has led the industry to target the population, especially the youth. For example, research shows that
tobacco advertisement, permission to smoke on school compounds, and parental smoking were associated 
with students’ intentions to smoke [13]. 

The Government of Ghana and Civil Society Organizations (CSOs) are putting in place measures to 
reduce the prevalence of tobacco use in the country. First, CSOs have reached an agreement with the 
National Development Planning Commission (NDPC) for the inclusion of the Tobacco Control Measures of 
the Public Health Act in the 40-year development agenda [14]. These CSOs are committed to reducing tobacco
addiction and its related diseases and deaths. School smoking policies and parental smoking behaviors are
important in reducing smoking intentions among Ghanaian youth [13]. Furthermore, the literature shows that 
smokers are likely to have friends who smoke [15]. 

Research suggests that the rate of smoking among Ghanaian youth could increase in the future. An 
understanding of the smoking tendencies among the youth in Ghana is warranted. The objectives of the 
present study were: (1) to recommend an appropriate predictive model to inform policy on tobacco use
and control in Ghana and (2) to determine predictors of smoking tendencies among junior high school
(JHS) students in Ghana. An understanding of the predictors and appropriate predictive model for smoking
tendencies among students would allow public health policymakers and administrators, community leaders, 
and school administrators to develop effective smoking prevention programs [13], aimed at reducing 
smoking tendencies among the youth. 


2. RESEARCH METHOD 

The Global Youth Tobacco Survey (GYTS) [16] data on Ghana for 2009 served as the data source.
The GYTS is a school-based survey that helps countries monitor tobacco use among youth. The 2009 data
are the most current tobacco survey data for Ghana to date. The GYTS also guides the implementation and
evaluation of tobacco control and prevention programs. The 2009 data set contains 8,295 observations and many variables.
Of the 8,295 observations, only 5,971 (72%) were analyzed. Approximately 28% (2,324) of the observations
were discarded due to missing data.


2.1. Variables 

The variables used were as follows. The response variable was “Whether a student had ever tried or
experimented with cigarette smoking, even one or two puffs?” Thirty-two (32) predictors were used (refer to
the appendix). Before fitting the logit model, the GYTS dataset was divided into two sets: training and testing.
The training set, which constituted 70% of the entire dataset, was used to specify the predictive model, and the
remaining 30% was used as a testing set to perform out-of-sample validation of the predictive accuracy of the
model.
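
The 70/30 partition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the function name `train_test_split_indices`, the fixed seed, and the use of a uniformly random shuffle are all hypothetical, since the text does not say how rows were assigned to each set.

```python
import random

def train_test_split_indices(n_rows, train_fraction=0.7, seed=0):
    """Shuffle row indices 0..n_rows-1 and split them into train and test lists."""
    rng = random.Random(seed)      # fixed seed (an assumption) for reproducibility
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_train = round(train_fraction * n_rows)
    return indices[:n_train], indices[n_train:]

# 5,971 is the number of complete observations reported in Section 2
train_idx, test_idx = train_test_split_indices(5971)
print(len(train_idx), len(test_idx))
```

Every row lands in exactly one of the two sets, so the testing set plays no part in fitting the model.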


2.2. Logit model 
Let $Y_i$ represent random variables taking values $y_i \in \{0, 1\}$ with probabilities $1 - \pi_i$ and $\pi_i$
respectively. This implies:

$$f(y_i; \pi_i) = \pi_i^{y_i} (1 - \pi_i)^{1 - y_i}. \quad (1)$$

Suppose the logit of the probability $\pi_i$ is given by

$$\log\left(\frac{\pi_i}{1 - \pi_i}\right) = X_i^T \beta. \quad (2)$$

The predictive form of (2) can be written as:

$$\pi(\beta \mid X_i) = \frac{1}{1 + e^{-X_i^T \beta}}, \quad (3)$$

where $X_i$ and $\beta$ are vectors of predictors and slopes respectively.
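To estimate (2), the loglikelihood is given by:

$$\begin{aligned}
l(\beta) &= \log \prod_{i=1}^{n} f(y_i; \pi_i) \\
&= \sum_{i=1}^{n} \left[ y_i \log(\pi_i) + (1 - y_i) \log(1 - \pi_i) \right] \\
&= \sum_{i=1}^{n} \left[ y_i \log(\pi(\beta \mid X_i)) + (1 - y_i) \log(1 - \pi(\beta \mid X_i)) \right].
\end{aligned}$$

Let $\nabla l(\beta) = \frac{\partial l(\beta)}{\partial \beta_k}$ and $H(\beta) = \frac{\partial^2 l(\beta)}{\partial \beta_k \partial \beta_m}$ represent the gradient of the loglikelihood and the Hessian matrix
respectively. The maximizer of $l(\beta)$ has no closed-form solution, so we resort to the Newton-Raphson algorithm for an approximate solution.
The Newton-Raphson algorithm [17] is given as follows: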


1. Start from some initial values $\beta_0$.
2. Set $\beta_t = \beta_{t-1} - H(\beta_{t-1})^{-1} \nabla l(\beta_{t-1})$ (a loop).
3. Repeat step 2 until $\beta_t$ is close to $\beta_{t-1}$.





Inference is carried out as follows:

$$\sqrt{n}(\hat{\beta} - \beta) \to N(0, I(\beta)^{-1}) \text{ as } n \to \infty,$$

where $I(\beta) = -H(\beta)$ is the Fisher information and $I(\beta)^{-1}$ is the variance-covariance matrix.
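
The Newton-Raphson steps and the use of the inverse Fisher information for standard errors can be sketched as below. This is an illustrative sketch, not the study's implementation: it assumes a model with only an intercept and one slope so that the 2x2 information matrix can be inverted by hand, and the function names (`sigmoid`, `score_and_information`, `fit_logit_newton`), the starting values, the tolerance, and the toy data are all hypothetical.

```python
import math

def sigmoid(z):
    """Predictive form of the logit model: probability from the linear predictor."""
    return 1.0 / (1.0 + math.exp(-z))

def score_and_information(b0, b1, x, y):
    """Gradient of the loglikelihood and Fisher information I = -H at (b0, b1)."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for xi, yi in zip(x, y):
        p = sigmoid(b0 + b1 * xi)
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    return g0, g1, i00, i01, i11

def fit_logit_newton(x, y, tol=1e-10, max_iter=50):
    """Newton-Raphson for a logit model with an intercept and one slope."""
    b0, b1 = 0.0, 0.0                      # step 1: initial values
    for _ in range(max_iter):
        g0, g1, i00, i01, i11 = score_and_information(b0, b1, x, y)
        det = i00 * i11 - i01 * i01
        # step 2: beta_t = beta_{t-1} - H^{-1} grad = beta_{t-1} + I^{-1} grad
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:        # step 3: stop once the update is tiny
            break
    # Standard errors from the diagonal of I^{-1} at the final estimate
    _, _, i00, i01, i11 = score_and_information(b0, b1, x, y)
    det = i00 * i11 - i01 * i01
    return b0, b1, math.sqrt(i11 / det), math.sqrt(i00 / det)

# Hypothetical toy data: a binary predictor and a binary response
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1, se0, se1 = fit_logit_newton(x, y)
print(b0, b1, se0, se1)
```

With a single binary predictor the maximum likelihood estimates have a closed form (the log-odds in each group), which gives a convenient check on the iteration. A model with 32 predictors would need a general linear solver in place of the hand-written 2x2 inverse.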


2.2.1. Variable selection-forward stepwise selection 

We begin with the intercept model then add predictors one at a time until all predictors are in the 
model. The choice of predictor to be added to the model at each step depends on which among the remaining 
predictors gives the maximum additional improvement to the fit (Deviance). The single best model is chosen 
using the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). AIC and BIC are 
computed as follows: 


$$\mathrm{AIC} = -2l + 2p,$$
$$\mathrm{BIC} = -2l + p \log(n),$$

where $p$ is the number of predictors, $n$ is the number of observations, and $l$ is the log-likelihood.
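
The two criteria differ only in the penalty charged per predictor, which the short sketch below makes concrete. The function names `aic` and `bic` are hypothetical, and the training-set size of 4,180 is an assumption obtained by taking 70% of the 5,971 analyzed observations; the text does not report the exact count.

```python
import math

def aic(loglik, p):
    """Akaike Information Criterion: -2l + 2p."""
    return -2.0 * loglik + 2.0 * p

def bic(loglik, p, n):
    """Bayesian Information Criterion: -2l + p*log(n)."""
    return -2.0 * loglik + p * math.log(n)

n_train = 4180                      # assumed training-set size (70% of 5,971)
penalty_aic = 2.0                   # cost of one extra predictor under AIC
penalty_bic = math.log(n_train)     # cost of one extra predictor under BIC
print(penalty_aic, round(penalty_bic, 2))
```

Because log(n) exceeds 2 whenever n is at least 8, BIC charges more per predictor than AIC at this sample size, so a BIC-selected model is expected to be the more parsimonious of the two.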


2.3. Prediction metrics 

A confusion matrix contains information about actual and predicted classifications done by a 
classification system. Table 1 shows the confusion matrix for a two-class classifier. The true positive rate 
(TPR) is given by the ratio of the true positive (TP) to the sum of the true positive and false negative (FN). 
The false positive rate (FPR) is obtained as the ratio of the false positive (FP) to the sum of the false positive 
and true negative (TN). 


Table 1. Confusion Matrix 








                     Predicted Class
                     Y=1     Y=0     Total      Rate
Actual Class   Y=1   TP      FN      TP + FN    TPR = TP / (TP + FN)
               Y=0   FP      TN      FP + TN    FPR = FP / (FP + TN)





At different thresholds, say 0.00, 0.01, ..., 1.00, TPR (sensitivity) is plotted against FPR (1 - specificity)
to obtain the receiver operating characteristic (ROC) curve. The area under the ROC
curve, also referred to as the area under the curve (AUC), measures the prediction accuracy of the logit
model.
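
The construction of the ROC curve and its area can be sketched as follows. This is a minimal illustration with hypothetical names (`roc_points`, `auc_trapezoid`) and made-up scores; it assumes an observation is classified as positive when its predicted probability is at least the threshold, and it uses the trapezoidal rule for the area, neither of which is specified in the text.

```python
def roc_points(y_true, scores, thresholds):
    """Return one (FPR, TPR) pair per threshold; needs both classes present."""
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < t)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)            # order by FPR, then TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

thresholds = [k / 100 for k in range(101)]    # 0.00, 0.01, ..., 1.00
y_true = [1, 1, 1, 0, 0, 0]                   # hypothetical labels
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]      # hypothetical predicted probabilities
auc = auc_trapezoid(roc_points(y_true, scores, thresholds))
print(auc)
```

In this toy example eight of the nine (positive, negative) pairs are ranked correctly, and the computed area equals 8/9.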
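2.4. C-Index
Based on our data, let $(i, j)$ denote a pair of observations whose actual responses are labelled 1 and 0 respectively, while $\pi(\beta \mid X_i)$ and
$\pi(\beta \mid X_j)$ are the corresponding predicted probabilities. For each pair $(i, j)$, the C-Index is defined as:

$$C_{ij} = \Pr[\pi(\beta \mid X_i) > \pi(\beta \mid X_j) \mid Y_i = 1, Y_j = 0].$$

Equivalently,

$$C_{ij} = \Pr[X_i^T \beta > X_j^T \beta \mid Y_i = 1, Y_j = 0].$$

Let

$$\eta_i^{(1)} = X_i^T \beta \mid Y_i = 1,$$
$$\eta_j^{(0)} = X_j^T \beta \mid Y_j = 0,$$

$N_1$ = number of 1's, and
$N_0$ = number of 0's.

For all pairs $(i, j)$, the C-Index is estimated by the test statistic:

$$\hat{C} = \frac{1}{N_1 N_0} \sum_{i=1}^{N_1} \sum_{j=1}^{N_0} I\{\eta_i^{(1)}, \eta_j^{(0)}\},$$

where

$$I\{\eta_i^{(1)}, \eta_j^{(0)}\} = \begin{cases} 1, & \text{if } \eta_i^{(1)} > \eta_j^{(0)}, \\ 0.5, & \text{if } \eta_i^{(1)} = \eta_j^{(0)}, \\ 0, & \text{if } \eta_i^{(1)} < \eta_j^{(0)}. \end{cases}$$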


C < 0.5 implies poor prediction accuracy by the model. If C = 0.5, then the model prediction accuracy is
the same as a random guess. Values of C > 0.5 and C = 1 indicate better and perfect model prediction
accuracies respectively.
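
The pairwise estimator can be sketched directly from its definition. The function name `c_index` and the toy linear predictors are hypothetical; the scoring of each pair (1 for a correctly ordered pair, 0.5 for a tie, 0 otherwise) follows the definition above.

```python
def c_index(y_true, linear_predictors):
    """Estimate the C-Index by comparing every (response 1, response 0) pair."""
    eta_1 = [e for y, e in zip(y_true, linear_predictors) if y == 1]
    eta_0 = [e for y, e in zip(y_true, linear_predictors) if y == 0]
    total = 0.0
    for e1 in eta_1:
        for e0 in eta_0:
            if e1 > e0:
                total += 1.0        # pair ordered correctly
            elif e1 == e0:
                total += 0.5        # tie counts as half
    return total / (len(eta_1) * len(eta_0))

y_true = [1, 1, 1, 0, 0, 0]                  # hypothetical responses
eta = [2.0, 1.0, 0.0, 0.0, -1.0, 0.5]        # hypothetical linear predictors
c_hat = c_index(y_true, eta)
print(c_hat)
```

The double loop costs N1 x N0 comparisons, which is acceptable for a few thousand observations; a rank-based formula would be preferable for much larger samples.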


3. RESULTS AND ANALYSIS 
3.1. Model assessment metrics 

This study used the GYTS data to specify a model for predicting the likelihood of smoking at the 
junior high school level in Ghana. Table 2 presents results on the model assessment metrics using both the 
training and the testing sets. A full logit model with 32 predictors and two logit models selected by
forward selection with the least AIC and the least BIC were estimated and compared using the C-Index and the C-Index per
Predictor. The model with the least AIC (2029.06) retained 14 predictors, while the model with the least BIC
(2086) retained 6 predictors, as shown in the Appendix.

The C-Index and the C-Index per Predictor were used to check the predictive accuracies of the full
model and the two selected models. In the training set, the C-Index for the full model (85.51%) was larger than those of the
AIC-based (84.65%) and BIC-based (82.00%) models. In the testing set, however, the C-Index for the full model
(80.67%) was smaller than those of the AIC-based (81.74%) and BIC-based (80.86%) models, as shown in
Table 2. The C-Index per Predictor for the full, AIC-based, and BIC-based models on the training set
was 2.67%, 6.05% and 13.67% respectively. Similarly, the C-Index per Predictor for the three models on
the testing set was 2.52%, 5.84%, and 13.48% respectively, as shown in Table 2.


Table 2. Model Assessment Metrics 





Description                                  Full Model   AIC-Model   BIC-Model
# of Predictors Selected                     32           14          6
C-Index - Training Set (%)                   85.51        84.65       82.00
C-Index - Testing Set (%)                    80.67        81.74       80.86
C-Index Per Predictor - Training Set (%)     2.67         6.05        13.67
C-Index Per Predictor - Testing Set (%)      2.52         5.84        13.48
A comparison of the three C-Indexes per Predictor indicated that the BIC-based model had the
highest predictive power per predictor, followed by the AIC-based model, in both the training and testing sets.
Assessment metrics computed on the training set tend to be optimistic because they reward overfitting. Therefore, the testing set was used to
choose the most appropriate model. The model was chosen on the basis of the testing-set C-Index per
Predictor and parsimony. The BIC-based model fit these criteria.
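
The text does not give a formula for the C-Index per Predictor. The values in Table 2 are consistent with dividing each C-Index by the number of predictors in the model, and the sketch below assumes that definition; the function name `c_index_per_predictor` and the dictionary layout are hypothetical, while the numbers are taken from Table 2.

```python
# C-Index values (%) and predictor counts as reported in Table 2
table2 = {
    "Full": {"predictors": 32, "train": 85.51, "test": 80.67},
    "AIC": {"predictors": 14, "train": 84.65, "test": 81.74},
    "BIC": {"predictors": 6, "train": 82.00, "test": 80.86},
}

def c_index_per_predictor(c_index_pct, n_predictors):
    """Assumed definition: C-Index divided by the number of predictors."""
    return c_index_pct / n_predictors

per_predictor = {
    name: (
        round(c_index_per_predictor(row["train"], row["predictors"]), 2),
        round(c_index_per_predictor(row["test"], row["predictors"]), 2),
    )
    for name, row in table2.items()
}
print(per_predictor)
```

Under this definition the metric rewards small models heavily: the BIC-based model gives up less than one percentage point of testing-set C-Index relative to the AIC-based model while using fewer than half as many predictors.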


3.2. ROC curves of the Full, AIC, and the BIC-based models 

Figure 1 shows the ROC curves of the Full (left), AIC (middle) and the BIC (right)-based models. 
Figure 1 compares the AUC for the training (grey) and the testing (black) sets of the three models. 
Essentially, the AUC is numerically the same as the C-Index. 
Figure 1. ROC Curves for Full (left), AIC (middle), and BIC (right)-Based Models


3.3. Predictors of smoking tendencies 

Table 3 presents the fitted BIC-Model with six (6) predictors that resulted in the highest C-Index per 
predictor. The six predictor variables significantly predicted students’ smoking tendencies. First, where 
students smoked significantly predicted their smoking tendencies. Participants who “Never smoked 
cigarettes” served as the comparison or reference group. Participants who smoked at home, school, at a 
friend’s house, at public events or other places, had significantly higher odds of smoking than those who 
indicated they did not smoke. Second, having closest friends who smoke was a significant predictor of
participants’ smoking tendencies. Students who indicated none of their closest friends smoked served as the
comparison group. Participants who indicated some, most or all of their friends smoked had significantly
higher odds of smoking than those who had no friends who smoked. Third, the number of days on which people
smoked in students’ homes in the students’ presence significantly predicted their tendency to smoke.
Students who indicated that nobody had smoked in their home on any of the past 7 days served as the
comparison group. Those who indicated that people had smoked in their home on 1-2, 3-4, 5-6 or 7 of the past 7 days had
significantly higher odds of smoking. Fourth, having chewed any form of tobacco products other than
cigarettes in the past 30 days significantly predicted students’ smoking tendencies. Students who indicated
they had chewed any form of tobacco products other than cigarettes in the past month had higher odds of
smoking cigarettes than those who had not. Fifth, a student’s sex was a significant predictor of the tendency
to smoke. Females had lower odds of smoking cigarettes than their male counterparts. Finally,
discussing the effects of smoking in JHS classes gave mixed results as a predictor of smoking
tendencies. Students who responded “Yes” to discussing the effects of
smoking in their classes served as the comparison group. Students who responded “No” had significantly
lower odds of smoking cigarettes. There were no significant differences in the tendencies to smoke cigarettes
between those who responded “Yes” and those who indicated they were not sure.
Table 3. Fitted Bayesian Information Criterion-Model

Variable (reference category)                       z-Score   Odds Ratio   Probability
(Intercept)                                         -11.892        0.116         0.104
Where do you usually smoke? (Never smoked)
  Home                                               12.273        9.915         0.908
  School                                              3.637        2.955         0.747
  Work                                                5.091       19.631         0.952
  Friends’ House                                      8.744       26.655         0.964
  Social Events                                       4.097       94.341         0.990
  Public Places                                       7.471       13.009         0.929
  Other                                               5.022       27.173         0.965
Do any of your closest friends smoke cigarettes? (None)
  Some                                                5.611        2.257         0.693
  Most                                                2.008        1.857         0.650
  All                                                 3.627        3.547         0.780
During the past 7 days, on how many days have people smoked in your home,
in your presence? (0)
  1-2                                                 5.882        2.681         0.728
  3-4                                                 2.706        2.354         0.702
  5-6                                                 3.221        2.663         0.727
  7                                                   3.324        2.225         0.690
During the past 30 days (month) have you ever chewed any form of tobacco
products other than cigarettes? (Yes)
  No                                                 -4.202        0.512         0.338
Sex (Male)
  Female                                             -3.476        0.650         0.394
During this school year, did you discuss in any of your classes about the effects of
smoking? (Yes)
  No                                                 -3.467        0.642         0.391
  Not Sure                                            1.312        1.309         0.567

|z| > 1.64 means statistically significant at the 10% level
|z| > 1.96 means statistically significant at the 5% level
|z| > 2.58 means statistically significant at the 1% level
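
The text does not say how the Probability column of Table 3 was computed. Its entries agree, to within rounding, with the transformation odds / (1 + odds) applied to the Odds Ratio column, and the sketch below assumes that relationship. The function name `odds_to_probability` is hypothetical, and only a few rows of the table are reproduced.

```python
def odds_to_probability(odds):
    """Assumed transformation: probability = odds / (1 + odds)."""
    return odds / (1.0 + odds)

# Odds ratios copied from selected rows of Table 3
odds_ratios = {
    "(Intercept)": 0.116,
    "Home": 9.915,
    "School": 2.955,
    "Female": 0.650,
}

probabilities = {
    name: round(odds_to_probability(value), 3)
    for name, value in odds_ratios.items()
}
print(probabilities)
```

An odds ratio above 1 maps to a value above 0.5 and an odds ratio below 1 maps to a value below 0.5, which matches the direction of each effect reported in the table.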





4. DISCUSSION 


The present study used the GYTS 2009 data to specify a model for predicting the likelihood of 
smoking at the junior high school level in Ghana. A major finding of the study was that where students
smoked significantly predicted their smoking tendencies: the home, school, a friend’s house, and public
events are environments that increase the likelihood that JHS students will smoke. For example,
parental smoking behaviors are important in reducing smoking intentions among Ghanaian youth [13].
Adolescents with parents who smoke at home would more likely have access to tobacco at home [18]. It is
also a common practice for parents and other adults in Ghana to send their children or other minors to
purchase tobacco products for them. A related finding was that the number of days on which people
smoked in students’ homes in their presence significantly predicted students’ likelihood of smoking.
Tobacco control measures in Ghana are specified in the Public Health Act 2012 (Act 851) [19]; the Act
prohibits smoking in public places, among other restrictions. Unfortunately, lack of enforcement of the Act
undermines Government and community efforts to control the negative effects of tobacco use [20].

A second finding was that students who smoked were more likely to have closest friends who also 
smoked. This finding is consistent with previous findings that smokers were likely to have friends who 
smoked [15]. Peer influence, according to [21], was a risk factor for smoking. Students who had been offered 
a cigarette were more likely to smoke. In addition, those with one or more friends had higher odds of smoking
than those who had no friends [21]. Thus, strategies for increasing adolescents’ self-esteem and decision- 
making capabilities are critical for smoking prevention in Ghana. As noted by [22], adolescents with high 
levels of self-esteem would be less susceptible to peer pressure than their counterparts with low levels of self- 
esteem. 

Third, the use of any form of tobacco products other than cigarettes predicted students’ smoking 
tendencies. Research has shown that individuals who chewed other forms of tobacco products in the past 
month had higher odds of smoking cigarettes than those who did not. Smokeless tobacco use is a significant
predictor of smoking initiation among young adult males. Participants who used smokeless tobacco were
more likely to begin smoking than nonusers [23]. Furthermore, smokeless tobacco has been associated with
health risks such as exposure to cancer-causing substances [24] and adverse cardiovascular health [25], and it is not a
feasible substitute for cigarette smoking [24].

Finally, male students had higher prevalence of smoking than their female counterparts. This finding 
is consistent with that of [20], who reported higher smoking tendencies among Ghanaian adult males than in 
females. Thus, early smoking intervention programs among school-aged adolescents are warranted, as bad
habits are difficult to break. Such programs would promote healthy lifestyle choices among students, such as the
avoidance of alcohol and tobacco consumption [26].

Findings from the present study have implications for school and community health. First, school 
smoking prevention programs need to include parents and community leaders. Second, students’ self-esteem 
and decision-making capabilities must be addressed in school intervention programs aimed at smoking
prevention. Students with high self-esteem would more likely resist smoking uptake due to peer
pressure [22]. In addition, students need the skills to help them make good decisions that would lead to
healthy lifestyles. Finally, the data used for the present study were the most current tobacco survey data
for Ghana, and were published in 2009. Therefore, it is incumbent on Ghana’s stakeholders in the health
sector to implement the GYTS to gain insight into the most current smoking behaviors among school-going
adolescents in the country.

The current study utilized a closed-ended survey (GYTS) to examine junior high school students’ 
smoking tendencies. Future research could use other approaches to gain further insight into smoking among
adolescents. First, future researchers should include qualitative approaches such as focus groups to allow
participants to freely express their feelings. This would provide a better understanding of students’ tendencies
toward smoking. Second, the present study provided aggregated national data. Future research using data
disaggregated by region would reveal regional disparities, if any, relating to JHS students’ smoking
tendencies. This would allow policy makers and health and school administrators to target region-specific
predictors of smoking tendencies for effective smoking prevention interventions. Finally, future researchers should
include multi-age populations covering junior high schools, senior high schools, colleges and universities.
Results of such studies would provide comprehensive information on the prevalence of smoking for the
respective educational levels in the country.


5. CONCLUSIONS 

The present study showed that environmental issues were important factors that impacted junior 
high school students’ smoking tendencies. The home, school, a friend’s house, and public events were 
environments that significantly predicted smoking behaviors. In addition, social interactions, especially with
peers, are significant determinants of students’ smoking behaviors. Furthermore, the study revealed that the
Logit model was a powerful statistical tool for identifying predictors of Ghanaian junior high school 
students’ smoking behaviors. 